Fix/put on busy shards #2965

Merged: 3 commits into master on Nov 22, 2024

Conversation

carpawell (Member)

No description provided.

carpawell self-assigned this Oct 7, 2024

codecov bot commented Oct 7, 2024

Codecov Report

Attention: Patch coverage is 32.69231% with 35 lines in your changes missing coverage. Please review.

Project coverage is 22.87%. Comparing base (f1b6982) to head (3024773).
Report is 22 commits behind head on master.

Files with missing lines Patch % Lines
pkg/local_object_storage/engine/put.go 32.55% 29 Missing ⚠️
pkg/local_object_storage/engine/engine.go 0.00% 4 Missing ⚠️
cmd/neofs-node/config.go 0.00% 1 Missing ⚠️
cmd/neofs-node/storage.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2965      +/-   ##
==========================================
+ Coverage   22.85%   22.87%   +0.01%     
==========================================
  Files         791      791              
  Lines       58603    58688      +85     
==========================================
+ Hits        13395    13425      +30     
- Misses      44312    44366      +54     
- Partials      896      897       +1     

☔ View full report in Codecov by Sentry.

carpawell force-pushed the fix/put-on-busy-shards branch 2 times, most recently from 13b546f to 12246c2 on October 16, 2024 15:37
carpawell marked this pull request as ready for review on October 16, 2024 15:40
carpawell (Member Author)

Well, it works. There is not much we can do about testing currently. It is hard to test in general because it is time-dependent. But it is quite straightforward code, and the tests are okay.
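
The time-dependent part can still be exercised if the deadline is a parameter that tests shrink to milliseconds. A minimal sketch, assuming a hypothetical putWithDeadline helper rather than the engine's real API:

```go
package engine_test

import (
	"errors"
	"testing"
	"time"
)

// putWithDeadline is a hypothetical stand-in for the retry-until-deadline
// logic: it keeps calling tryPut until it succeeds or the deadline expires.
func putWithDeadline(deadline time.Duration, tryPut func() error) error {
	timer := time.NewTimer(deadline)
	defer timer.Stop()

	ticker := time.NewTicker(time.Millisecond)
	defer ticker.Stop()

	for {
		if err := tryPut(); err == nil {
			return nil
		}
		select {
		case <-timer.C:
			return errors.New("deadline exceeded")
		case <-ticker.C:
		}
	}
}

func TestPutWithDeadline(t *testing.T) {
	// A shard that stays "busy" forever should fail once the tiny deadline passes.
	busy := func() error { return errors.New("pool overloaded") }
	if err := putWithDeadline(10*time.Millisecond, busy); err == nil {
		t.Fatal("expected a deadline error")
	}

	// A shard that frees up after a few attempts should succeed in time.
	attempts := 0
	flaky := func() error {
		attempts++
		if attempts < 3 {
			return errors.New("pool overloaded")
		}
		return nil
	}
	if err := putWithDeadline(100*time.Millisecond, flaky); err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
}
```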

roman-khimov (Member) left a comment

It's an improvement, but channels/routines would be even better.

}

func (e *StorageEngine) putToShardWithDeadLine(sh hashedShard, ind int, pool util.WorkerPool, addr oid.Address, prm PutPrm) error {
const deadline = 30 * time.Second

Member

Make it configurable? Treat zero as old behavior?

Member Author

Do you mean the node's config? One more value that no one knows what to pick?

Member

Yeah. This directly affects the reply latency. The other option of course is to check if we have any other timeout value that can be reused.

Member Author

np
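
For reference, treating a zero duration as the old behavior can be as simple as skipping the retry loop entirely. A hedged sketch: the config field name, the helper, and the 100 ms retry interval are illustrative, not the values the node actually uses.

```go
package engine

import (
	"fmt"
	"time"
)

// Config carries an illustrative option only; the real node config key and
// default were settled in the PR.
type Config struct {
	// PutRetryDeadline is how long a PUT may wait for a busy shard.
	// Zero means "do not wait at all", i.e. the pre-PR behavior.
	PutRetryDeadline time.Duration
}

func putObject(cfg Config, tryPut func() error) error {
	if cfg.PutRetryDeadline == 0 {
		// Old behavior: a busy shard pool fails the attempt immediately.
		return tryPut()
	}

	deadline := time.Now().Add(cfg.PutRetryDeadline)
	for {
		err := tryPut()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("put retry deadline %s exceeded: %w", cfg.PutRetryDeadline, err)
		}
		time.Sleep(100 * time.Millisecond) // retry interval is an assumption
	}
}
```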

carpawell (Member Author)

but channels/routines would be even better

@roman-khimov, what do you mean by that? Sending tasks via channels to every shard? What is the difference?

roman-khimov (Member)

How can I get a "block for 30s" behavior with a pool? How can I get a queue with a pool? That's the difference.
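
The distinction being drawn: a non-blocking worker pool rejects work when it is full, while a buffered channel per shard gives a natural queue where the sender can block with a timeout. A minimal sketch of the channel variant; the type and its sizes are illustrative and shutdown handling is omitted.

```go
package engine

import (
	"errors"
	"time"
)

// shardQueue is an illustrative per-shard task queue: a fixed set of workers
// drains a buffered channel, and enqueue blocks up to a timeout when the
// queue is full — the "block for 30s" behavior a plain pool does not give.
type shardQueue struct {
	tasks chan func()
}

func newShardQueue(workers, depth int) *shardQueue {
	q := &shardQueue{tasks: make(chan func(), depth)}
	for i := 0; i < workers; i++ {
		go func() {
			for task := range q.tasks {
				task()
			}
		}()
	}
	return q
}

var errQueueTimeout = errors.New("shard queue is full")

// enqueue waits for free space in the queue until the timeout expires.
func (q *shardQueue) enqueue(task func(), timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case q.tasks <- task: // blocks here while the queue is busy
		return nil
	case <-timer.C:
		return errQueueTimeout
	}
}
```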

roman-khimov (Member)

MaxBlockingTasks allows having some queue; maybe we can expose it through the config and test the behavior this way? I think we can have active threads with bigger queues at the shard level (although it depends on SSD/HDD) instead of the current huge pools with no queue. The poll/block time can be tuned too; we can run additional tests to see how it affects operation.
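
MaxBlockingTasks here presumably refers to the option of the ants worker-pool library used for the shard pools: with non-blocking mode off, Submit waits for a free worker, and the option caps how many callers may wait before Submit fails with ErrPoolOverload. A hedged sketch of configuring it; the pool size and queue limit are placeholders, not the values the node uses.

```go
package main

import (
	"fmt"
	"time"

	"github.com/panjf2000/ants/v2"
)

func main() {
	// A small pool whose Submit blocks when all workers are busy, with at
	// most 100 callers allowed to queue up behind it; beyond that limit
	// Submit returns ants.ErrPoolOverload, so the queue depth stays bounded.
	pool, err := ants.NewPool(
		4, // active workers per shard — a placeholder value
		ants.WithNonblocking(false),
		ants.WithMaxBlockingTasks(100),
	)
	if err != nil {
		panic(err)
	}
	defer pool.Release()

	for i := 0; i < 8; i++ {
		i := i
		if err := pool.Submit(func() {
			time.Sleep(50 * time.Millisecond) // simulate a slow shard write
			fmt.Println("task", i, "done")
		}); err != nil {
			fmt.Println("task", i, "rejected:", err)
		}
	}

	time.Sleep(time.Second) // crude wait for the demo tasks to finish
}
```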

If every shard's pool is overloaded with routines, choose the best one and try
to PUT an object to it for 30 seconds. Closes #2871.

Signed-off-by: Pavel Karpy <[email protected]>
A missing (zero) value just keeps the old behavior (no retries at all).

Signed-off-by: Pavel Karpy <[email protected]>
roman-khimov merged commit 2bb903c into master on Nov 22, 2024
22 checks passed
roman-khimov deleted the fix/put-on-busy-shards branch on November 22, 2024 at 07:27